Search results for "clustering"
Showing 10 of 312 documents
Estimating Missing Information by Cluster Analysis and Normalized Convolution
2018
Smart cities aim to improve their citizens' quality of life. Numerous ad hoc sensors must be deployed to monitor human activities and the conditions in which those activities take place. Although these sensors are becoming ever cheaper, their installation and maintenance costs rise rapidly with their number. We propose a methodology that limits the number of sensors to deploy by using a standard clustering technique and normalized convolution to estimate environmental information where sensors are missing. Despite its simplicity, our methodology provides accurate estimates.
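As a rough illustration of the normalized convolution mentioned in this abstract, the sketch below fills in missing readings on a 1-D line of sensor positions. It is not the authors' implementation: the Gaussian kernel, the 1-D layout, the function names, and the sample temperatures are all assumptions made for the example, and the clustering step of the methodology is not covered.

```python
import math


def gaussian_kernel(radius, sigma):
    """Return an unnormalised 1-D Gaussian kernel of length 2 * radius + 1."""
    return [math.exp(-(k * k) / (2.0 * sigma * sigma))
            for k in range(-radius, radius + 1)]


def normalized_convolution(values, certainty, kernel):
    """Estimate a signal at every position from sparse samples.

    values[i] is the reading at position i (ignored where certainty[i] == 0).
    certainty[i] is 1 where a sensor exists and 0 where it is missing.
    The estimate is conv(values * certainty) / conv(certainty).
    """
    radius = len(kernel) // 2
    n = len(values)
    estimate = []
    for i in range(n):
        num = 0.0
        den = 0.0
        for k, w in enumerate(kernel):
            j = i + k - radius
            if 0 <= j < n and certainty[j] > 0:
                num += w * certainty[j] * values[j]
                den += w * certainty[j]
        # None marks positions with no sensor inside the kernel support.
        estimate.append(num / den if den > 0 else None)
    return estimate


# Hypothetical temperature readings; zeros are placeholders for missing sensors.
readings = [20.0, 0.0, 0.0, 22.0, 0.0, 24.0]
present = [1, 0, 0, 1, 0, 1]
filled = normalized_convolution(readings, present,
                                gaussian_kernel(radius=3, sigma=1.5))
```

Dividing by the convolved certainty map is what keeps the placeholder zeros from dragging the estimate down: each output is a weighted average of the readings that actually exist nearby.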
Hierarchical networks of food exchange in the black garden ant Lasius niger
2020
In most eusocial insects, the division of labour results in relatively few individuals foraging for the entire colony. Thus, the survival of the colony depends on its efficiency in meeting the nutritional needs of all its members. Here, we characterise the network topology of a eusocial insect to understand the role and centrality of each caste in this network during the process of food dissemination. We constructed trophallaxis networks from 34 food-exchange experiments in black garden ants (Lasius niger). We tested the influence of brood and colony size on (i) global indices at the network level (i.e. efficiency, resilience, centralisation and modularity) and (ii) individual values (i.e. …
Fuzzy quantification of common and rare species in ecological communities (FuzzyQ)
2021
Most species in ecological communities are rare, whereas only a few are common. This distributional paradox has intrigued ecologists for decades, but the interpretation of species abundance distributions remains elusive. We present Fuzzy Quantification of Common and Rare Species in Ecological Communities (FuzzyQ) as an R package. FuzzyQ shifts the focus from the prevailing species-categorization approach to develop a quantitative framework that seeks to place each species along a rarity-commonness gradient. Given a community surveyed over a number of sites, quadrats, or any other convenient sampling unit, FuzzyQ uses a fuzzy clustering algorithm that estimates a probab…
Quantum clustering in non-spherical data distributions: Finding a suitable number of clusters
2017
Quantum Clustering (QC) provides an alternative approach to clustering algorithms, several of which are based on geometric relationships between data points. Instead, QC makes use of quantum mechanics concepts to find structures (clusters) in data sets by finding the minima of a quantum potential. The starting point of QC is a Parzen estimator with a fixed length scale, which significantly affects the final cluster allocation. This dependence on an adjustable parameter is common to other methods. We propose a framework to find suitable values of the length parameter σ by optimising twin measures of cluster separation and consistency for a given cluster number. This is an extension of the Se…
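To make the idea of "finding the minima of a quantum potential" concrete, here is a minimal 1-D sketch. It assumes the commonly cited form of the potential derived from a Gaussian Parzen estimate, V(x) = E - d/2 + Σ (x - x_i)² w_i / (2σ² Σ w_i) with w_i = exp(-(x - x_i)² / (2σ²)), and drops the additive constant E. The toy data, the grid search, and the function name are assumptions for illustration; the paper's procedure for selecting σ is not reproduced.

```python
import math


def quantum_potential(x, data, sigma):
    """Quantum potential at x for 1-D data (d = 1), up to the constant E."""
    weights = [math.exp(-((x - xi) ** 2) / (2.0 * sigma ** 2)) for xi in data]
    psi = sum(weights)  # unnormalised Parzen estimate with length scale sigma
    weighted_sq = sum(w * (x - xi) ** 2 for w, xi in zip(weights, data))
    return -0.5 + weighted_sq / (2.0 * sigma ** 2 * psi)


# Two tight, hypothetical groups of points around 0 and 10.
data = [-0.1, 0.0, 0.1, 9.9, 10.0, 10.1]
sigma = 1.0

# Evaluate the potential on a grid and keep strict local minima.
grid = [i * 0.1 for i in range(-20, 121)]
potential = [quantum_potential(x, data, sigma) for x in grid]
minima = [
    grid[i]
    for i in range(1, len(grid) - 1)
    if potential[i] < potential[i - 1] and potential[i] < potential[i + 1]
]
```

With this toy data the local minima fall near the two group centres. Changing `sigma` changes how many minima appear, which is the dependence on the length scale that the abstract describes.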
SpCLUST: Towards a fast and reliable clustering for potentially divergent biological sequences
2019
This paper presents SpCLUST, a new C++ package that takes a list of sequences as input, aligns them with MUSCLE, computes their similarity matrix in parallel and then performs the clustering. SpCLUST extends previously released software by integrating additional scoring matrices, which enable it to cover the clustering of amino-acid sequences. The similarity matrix is now computed in parallel according to the master/slave distributed architecture, using MPI. Performance analysis, carried out on two real datasets of 100 nucleotide sequences and 1049 amino-acid sequences, shows that the resulting library substantially outperforms the original Python package. The proposed pac…
Retrospective Proteomic Screening of 100 Breast Cancer Tissues.
2017
The present investigation was conducted on one hundred breast cancer tissue fragments, collected and immediately cryopreserved following surgical resection. The specimens were selected from patients with invasive ductal carcinoma of the breast, the most frequent and potentially aggressive type of mammary cancer, with the aim of increasing knowledge of breast cancer molecular markers potentially useful for clinical applications. The proteomic screening, by 2D-IPG and mass spectrometry, allowed us to identify two main classes of protein clusters: proteins expressed ubiquitously at high levels in all patients, and proteins expressed sporadically among the same patients. Wit…
FragClust and TestClust, two informatics tools for chemical structure hierarchical clustering analysis applied to lipidomics. The example of Alzheime…
2016
Lipidomic analysis is able to measure simultaneously thousands of compounds belonging to a few lipid classes. In each lipid class, compounds differ only by the acyl radical, ranging between C10:0 (capric acid) and C24:0 (lignoceric acid). Although some metabolites have a peculiar pathological role, more often compounds belonging to a single lipid class exert the same biological effect. Here, we present a lipidomics workflow that extracts the tandem mass spectrometry data from individual files and uses them to group compounds into structurally homogeneous clusters by chemical structure hierarchical clustering analysis (CHCA). The case-to-control peak area ratios of the metabolites are then a…
A clustering package for nucleotide sequences using Laplacian Eigenmaps and Gaussian Mixture Model.
2018
In this article, a new Python package for nucleotide sequence clustering is proposed. This package, freely available online, implements a Laplacian eigenmap embedding and a Gaussian Mixture Model for DNA clustering. It takes nucleotide sequences as input, and produces the optimal number of clusters along with a relevant visualization. Although we did not optimise the computational speed, our method still performs reasonably well in practice. Our focus was mainly on data analytics and accuracy and, as a result, our approach outperforms the state of the art, even in the case of divergent sequences. Furthermore, a priori knowledge of the number of clust…
Autoimmune polyglandular diseases.
2019
Autoimmune polyglandular diseases (APD) are defined as the presence of two autoimmune-induced endocrine failures. With respect to the significant morbidity and potential mortality of APD, the diagnostic objective is to detect APD at an early stage, with the advantage of less frequent complications, effective therapy and better prognosis. This requires that patients at risk be regularly screened for subclinical endocrinopathies prior to clinical manifestation. Regarding the time interval between manifestation of first and further endocrinopathies, regular and long-term follow-up is warranted. Quality of life and psychosocial status are poor in APD patients and involved relatives. Familial c…
Innovative Strategies to Develop Chemical Categories Using a Combination of Structural and Toxicological Properties.
2016
Interest is increasing in the development of non-animal methods for toxicological evaluations. These methods are, however, particularly challenging for complex toxicological endpoints such as repeated dose toxicity. European legislation, e.g., the European Union's Cosmetic Directive and REACH, demands the use of alternative methods. Frameworks, such as the Read-across Assessment Framework or the Adverse Outcome Pathway Knowledge Base, support the development of these methods. The aim of the project presented in this publication was to develop substance categories for a read-across with complex endpoints of toxicity based on existing databases. The basic conceptual approach was to combine str…